Method and system for limiting the impact of undesirable behavior of computers on a shared data network

ABSTRACT

Undesirable behavior patterns of computers on a network impact network performance. A system and method are provided for limiting the impact of undesirable behavior of computers on the network. The network, through which packets of data are interchanged between the computers, includes one or more forwarding devices that are controlled or instructed by one or more packet traffic monitors. Each of the packet traffic monitors is configured for monitoring the packets; for determining if the information about the pattern of behavior from any of the computers is trustworthy; for determining, upon discovering that one or more of the patterns of behavior is undesirable, a type of the undesirable pattern behavior; and for determining a proper action for mitigating that type of undesirable behavior. The proper action is performed by mitigation means controlling the one or more forwarding devices.

REFERENCE TO PRIOR APPLICATION

A claim is hereby made for the benefit under 35 U.S.C. 119(e) of U.S. Provisional Application Serial No. 60/252,821, filed Nov. 22, 2000, titled “Method and System for Limiting the Impact of Undesirable Behavior of Computers on a Shared Data Network,” which is incorporated herein by reference.

FIELD OF THE INVENTION

The present invention is generally related to computer networks and, specifically, to switch, router or bridge-based networks.

BACKGROUND OF THE INVENTION

Computer networks connect and provide a data communications service between addressable devices (or nodes), including computers, servers, printers and the like. Computer networks are generally classified according to their geographical extent as local area networks (LANs), metropolitan area networks (MANs) and wide area networks (WANs). The present invention is related to and can be implemented in any of these network classes.

Ethernet is one technology of choice upon which data networks are built. Ethernet is typically characterized as a multi-access packet-switched communications network for carrying data among locally distributed computers. The shared-communications channel in Ethernet is a passive broadcast medium with no packet address recognition or central control. The passive broadcast medium forms a backbone of the Ethernet network and a transmission medium that is shared between two or more addressable devices. A LAN in the Ethernet network is a network segment that covers a relatively small geographic area. Although LANs offer a high-speed communications and data sharing service, LANs have basic limitations such as the number of addressable devices, bandwidth and physical extent. By comparison, MANs and WANs offer a greater physical extent and larger number of addressable devices but slower communications speed.

To extend the benefits of a LAN beyond its basic limitations, forwarding devices (referred to also as stations) such as switching, routing or bridging devices are often used to form an extended network. Forwarding devices are multi-port addressable devices interposed between any number of LAN backbones or between LAN backbones and the long distance backbones of a MAN or WAN.

In making data traffic forwarding decisions, these devices use unique identifiers (UIDs) of the computers (also referred to as hosts or end stations). Specifically, computers communicate by sending and receiving packets or groups of packets that, in addition to payload, include the MAC (media access control) addresses (or UIDs) of the source and destination computers (the MAC address is considered a low-level address as compared to an Internet Protocol, IP, address).

In forwarding data traffic, the forwarding devices distinguish packets by their destination address type. For example, a unicast packet is a packet with a particular host address as its destination. A packet that is sent to a group of hosts is a multicast packet. In this case, the packet includes a group address UID as its destination. A group address dedicated to the group of all hosts is a broadcast address and a multicast packet addressed to all hosts is a broadcast packet. One type of broadcast packet, known as an ARP (address resolution protocol) request, is sent for requesting the Ethernet address (UID or MAC address) of a host in the network. The ARP request contains the IP address of the host to be queried and that host, upon recognizing the IP address as its own, returns a MAC address answer. ARP is the protocol used to map IP addresses to MAC (Ethernet) addresses for transport of data traffic from the Internet to hosts via the local network (Ethernet segment).
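By way of illustration only, the distinction among unicast, multicast and broadcast destinations can be sketched as follows. The function name classify_destination is hypothetical and is not part of the described system. The sketch relies on the standard Ethernet convention that the least significant bit of the first address octet marks a group address and that the all-ones address is the broadcast address.

```python
def classify_destination(mac: str) -> str:
    """Classify an Ethernet destination address (hypothetical helper).

    Accepts addresses written as aa:bb:cc:dd:ee:ff or aa-bb-cc-dd-ee-ff.
    """
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    # The group address of all hosts is the all-ones address.
    if octets == b"\xff" * 6:
        return "broadcast"
    # The least significant bit of the first octet marks a group address.
    if octets[0] & 0x01:
        return "multicast"
    return "unicast"
```

A monitor could apply such a test to each observed packet in order to count broadcast and multicast traffic separately from unicast traffic.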

In providing the foregoing data communications service, networks can be distinguished based on the type of forwarding device(s) they include. Forwarding devices commonly used in networks include hubs, repeaters, switches, bridges and routers. A repeater is a physical layer (layer 1) device used to interconnect the conductor segments of an extended network and enables them to be treated as a single conductor. The repeater amplifies and restores the timing margins of packet bit streams, but it does not use addressing for packet forwarding. A hub is a physical layer device that connects multiple hosts via dedicated conductors, and in some respects it functions as a multi-port repeater. The hub receives a packet in one port and re-transmits it to all of its other ports. However, in a shared medium comprising a hub-connected Ethernet segment, all hosts are competing for a limited amount of bandwidth.

A switch is also a physical layer device although more intelligent than the hub. A switch is a multi-port device designed with logic for knowing to which port of the switch each device (e.g., host or another switch) is connected. The switch isolates each port and makes it appear that the network attachment to that port is the only one. Any data received at one of the ports is then switched, using the logic in the switch, to a specific destination port. The switch will flood packets to every port if it is not sure where the destination of such packets is or if the destination address in the packets is a broadcast address. Since the switch operates at the physical layer it switches in hardware. Thus, in extended networks, this faster throughput and higher port density make switching technology a more dominant complement to routing than bridging.
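The forwarding and flooding behavior described above may be sketched, under simplifying assumptions, as follows. The class LearningSwitch and its method names are hypothetical. The sketch assumes that the switch learns host locations from the source addresses of received packets, which is one common way of obtaining the logic for knowing to which port each device is connected.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Illustrative switch model: learn source ports, flood when unsure."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port on which it was last seen

    def receive(self, in_port, src, dst):
        """Return the list of ports on which the packet is sent out."""
        # Assumption: the location of the sender is learned from the packet.
        self.table[src] = in_port
        out_port = self.table.get(dst)
        # Flood on broadcast or when the destination is not yet known.
        if dst == BROADCAST or out_port is None:
            return [p for p in self.ports if p != in_port]
        # A destination on the arrival port needs no forwarding.
        if out_port == in_port:
            return []
        return [out_port]
```

In this model a host that continuously emits broadcast packets causes every such packet to be flooded to all other ports, which is the behavior the monitor described herein is intended to curb.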

A bridge is a data layer (layer 2) device that switches in software, and it is concerned with addresses of network devices and not the actual paths between them. The bridge enables devices on different LAN segments to communicate with each other as if they were on the same switch or hub, and it interconnects LANs of unlike bandwidth. The bridge can filter packets based on data-layer information contained within the MAC address, protocol, etc. Moreover, the bridge will flood packets to every host in the network topology if it is not sure where the destination of such packets is or if the destination address in the packets is a broadcast address. Thus, bridges propagate ARP request broadcasts like any other Ethernet broadcast and transparently bridge (forward) the ARP answers. Bridges respond to ARP requests for hosts known to them or, alternatively, they send their own ARP requests on the network. Notably, ARP requests are transparent to bridging but not to routers. In a bridge-based network, when the one or more bridges forward packets by flooding or forward broadcast traffic (including ARP requests), the bandwidth of the network is limited to the bandwidth of a single LAN. This limitation is present even with richly connected network segments, especially since redundant connections are inactive standby connections.

By comparison, in a switch-based network switches are faster but switches do little to restrict passage of broadcast traffic in the network. Broadcast traffic is not restricted in a switch-based network since switches will flood packets to every port if they are not sure where the destination of such packets is or if the destination address in the packets is a broadcast address. Generally, a switch-based network as shown in FIG. 1 is characterized in that it does not discard any packets except during reconfiguration of the network. FIG. 1 illustrates a switch-based network 10 where the forwarding devices (switches) 112 are interconnected in an arbitrary topology. Their larger scale makes switch-based networks particularly vulnerable to common network pathologies including broadcast storms, ARP fights, stolen MAC addresses or any other undesirable behavior. Such pathologies exist in traditional, shared broadcast media, but are more relevant in switch-based networks because of their large scale and modern pressure for Internet addresses. They may happen either by accident or through malice by rogue computers.

ARP fights occur when two hosts with different MAC (layer 2 hardware) addresses conflict for the same IP address. ARP fights occur, for example, as a result of misconfiguration or buggy implementations of DHCP (dynamic host configuration protocol), which is a protocol for dynamically allocating IP addresses to computers on a LAN.

A stolen Ethernet (MAC) address situation occurs when two IP addresses map to the same MAC address. ARP is not suited for resolving conflicting responses, and it could be used by an unruly host in a man-in-the-middle attack. Such an attack is characterized in that the unruly host illegally intercepts the ARP request communications and adopts other hosts' MAC addresses.
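As a minimal sketch of how the two conditions just defined might be recognized from observed ARP answers, consider the following. The class ArpConflictDetector and the finding labels are hypothetical. The sketch applies the definitions literally, so a host that legitimately answers for several IP addresses would also be reported; a practical monitor would need a list of such known exceptions.

```python
from collections import defaultdict


class ArpConflictDetector:
    """Track IP-to-MAC bindings seen in ARP answers (illustrative only)."""

    def __init__(self):
        self.macs_by_ip = defaultdict(set)
        self.ips_by_mac = defaultdict(set)

    def observe(self, ip, mac):
        """Record one observed binding and return any conflicts it reveals."""
        self.macs_by_ip[ip].add(mac)
        self.ips_by_mac[mac].add(ip)
        findings = []
        # ARP fight: two different MAC addresses claim the same IP address.
        if len(self.macs_by_ip[ip]) > 1:
            findings.append(("arp_fight", ip, sorted(self.macs_by_ip[ip])))
        # Stolen MAC address: two IP addresses map to the same MAC address.
        if len(self.ips_by_mac[mac]) > 1:
            findings.append(("stolen_mac", mac, sorted(self.ips_by_mac[mac])))
        return findings
```

The returned findings identify the addresses involved, which a monitor could then combine with topology information to locate the offending host.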

Broadcast storms occur when a buggy or malevolent host emits a continuous stream of broadcast packets. With the emission of a broadcast storm, even a single host can impose a limit on the practical size of a network through consumption of too many network resources. For example, a single host can execute a denial-of-service attack on all other hosts on the same subnet. In larger networks, disruptive behavior, including behavior akin to broadcast storms, can be frequently exhibited simply because there are numerous hosts.

In order to isolate broadcast storms, a switch-based network (or bridge-based network) can be broken into discrete broadcast domains, referred to as virtual LANs (VLANs), which are connected through routers. A router is a network layer (layer 3) device that uses network addressing and a routing protocol in forwarding packets. Unlike a bridge, the router is concerned with the paths between devices. The router analyzes the addresses of all packet traffic coming in through its ports and if the traffic is not local, the router sends the traffic out through one of its other ports. Thus, when a host sends a packet to a router it sends that packet addressed to the router's physical layer (MAC) address with the network layer (protocol) address of the destination host. As it examines the destination host's protocol address the router determines that it either knows or doesn't know how to forward the packet to the next hop (router). If the router knows the next hop, it changes the MAC address to that of the next hop and forwards the packet to that hop; and, alternatively, if it knows the destination address the router forwards the packet to the destination host. As mentioned before, ARP requests are transparent to bridging but not to routers. Routers do not propagate the ARP request broadcasts because routers are network level (3) devices, and Ethernet, Token-Ring, FDDI (fiber distributed data interface) and ATM (asynchronous transfer mode) are data-link protocols (data layer (2) protocols). For propagating a packet, the host must first use its routing protocols to select the proper router (i.e., the proper IP address of the proper router) that can be reached via Ethernet ARPs. The proper router responds to an ARP request containing its IP address with its MAC (Ethernet) address. Then, the packet is transmitted to the MAC address of the router through which it is re-transmitted toward its actual destination.

To improve throughput performance, many scaled networks utilize Ethernet switches (e.g., Gigabit Ethernet switches) between routers in a routed backbone. Switch-based Ethernet networks that are scaled through routers isolate the broadcast domains and are able, in turn, to isolate traffic between different pairs of hosts for performance and security. Moreover, their aggregate bandwidth allows switched networks to scale larger than broadcast networks using hubs. However, routers are inherently slower because of the added processing they do in packet analysis. Namely, routers introduce bottlenecks in data traffic. Moreover, routers do not solve the other above-described network pathologies.

Accordingly, there remains a need to address network communications problems. To that end, the present invention provides solutions that address the above-mentioned pathologies.

SUMMARY OF THE INVENTION

The solution proposed by the present invention can be implemented as a method, system, device, computer product or computer program module. Preferably, the present invention contemplates a solution that includes using a packet traffic monitor that can observe packets in one or more places on the network. The packet traffic monitor may be built into network components such as switches or hosts, or it may be built as a separate device. When the packet traffic monitor is attached to a forwarding component (e.g., switch), and not to the hosts it monitors, it cannot simply shut down those hosts. Hence the packet traffic monitor relies on interrogating and influencing packet forwarding decisions of the switches (or other forwarding devices). The packet traffic monitor is configured, or programmed, to recognize undesirable packet traffic patterns and to instruct appropriate switches to discard packets or isolate offending hosts when an undesirable pattern is detected.

Although the invention contemplates using the packet traffic monitor to detect any types of behavior and undesirable patterns of packet traffic, the packet traffic monitor is preferably expected to detect at least the pathologies that are listed below. The pathologies (or network faults) of greater interest include: (1) broadcast storms, i.e., overuse or inappropriate use of the broadcast address or multicast addresses; (2) stolen IP address, i.e., use of an IP address by two machines simultaneously, in which case the monitor is expected to detect the situation and, from previous observation of the network, to guess which host normally uses the address; (3) stolen MAC address, i.e., use of a MAC address by two machines simultaneously; and (4) malformed packets. It is noted that the packet traffic monitor can be configured to look for a wide range of behaviors that an administrator might consider undesirable although not necessarily network “faults.”

Which patterns can be observed will depend on where in the network the packet traffic monitor is placed. To detect a broadcast storm, the monitor could be placed anywhere in the network (except perhaps at a low-bandwidth link). To detect other packet traffic patterns the monitor might have to be placed at strategically located points. For example, to detect overuse or abuse of a given server, the monitor would have to be able to see packets arriving at that server.

The invention causes the undesirable packets in the network not to be forwarded through the network. For example, in the case of broadcast storms, the broadcast packets would not be forwarded for exponentially longer periods of time as the offending host continues to try to send them. Although the host may still flood its local segment it cannot flood the rest of the network.

With the use of the packet traffic monitor, the invention can advantageously limit the damage caused by overuse of broadcast packets without needing to use routers for this purpose, and without preventing acceptable broadcast packet traffic. The invention can limit the harm done by a host to the network segment containing the host, rather than to the (possibly much larger) subnet.

In accordance with the purpose of the present invention as embodied and broadly described herein, each of the packet traffic monitors can be configured for monitoring the network for patterns of packet traffic behavior and for determining if the information about a particular pattern of behavior from any of the computers is trustworthy. It is further configured for determining, upon discovering that one or more of the patterns of behavior is undesirable, the type of the undesirable pattern behavior; and determining a proper action for mitigating that type of undesirable behavior. The proper action is performed by mitigation means that can control the forwarding devices.

Advantages of the present invention will be set forth, in part, in the description herein and, in part, will be understood by those skilled in the art from the description herein. The advantages of the invention can be realized and attained by means of the elements and combinations particularly pointed out in the appended claims and equivalents.

BRIEF DESCRIPTION OF DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate several embodiments of the invention, and, together with the description, serve to explain the principles of the invention. Wherever convenient, the same reference numbers will be used to refer to the same or like elements throughout the drawings, in which:

FIG. 1 illustrates a switch-based network.

FIGS. 2-4 illustrate a switch-based network with one or more packet traffic monitors.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a solution intended for limiting the impact of detrimental behavior of computers interconnected via a shared data network. The solution may be implemented in a method, system, device, computer program product or computer program module. The invention introduces to the network one or more packet traffic monitors configured to detect undesirable behavior and to mitigate its effects. A packet traffic monitor in accordance with the present invention would detect this behavior automatically.

To enable one of ordinary skill in the art to make and use the invention, the description of the invention is presented herein in the context of a patent application and its requirements. Although the invention will be described in accordance with the shown embodiments, one of ordinary skill in the art will readily recognize that there could be variations to the embodiments and those variations would be within the scope and spirit of the invention.

A primary goal of the invention is to limit the impact of undesirable behavior of computers on a shared data network through which packets of data are passing to all its computers. To that end the invention contemplates monitoring the network for any pattern of packet traffic behavior that can be considered undesirable. A pattern of behavior can be identified as a sample or something which is used as an example of behavior, a recognizable way in which something is done, organized or happens, or any regularly repeated arrangement of data, signals etc. Moreover, a pattern of behavior can include a single packet of data or any other number of packets or groups of packets. Undesirable or offending behavior can include any behavior that is designated by a network administrator as notable or, more specifically, unwanted, detrimental or unwelcome. Furthermore, even if not specifically designated by a network administrator as notable or undesirable, a pattern of behavior can be considered undesirable any time it is unwanted, detrimental or unwelcome. Examples of such undesirable behavior are the common network pathologies (or LAN pathologies) as outlined above and as will be further mentioned below.

In dealing with recognition of undesirable behavior, the invention is designed to rely on an understanding of the network topology (be it switched Ethernet network, bridged Ethernet network, scaled switch-based network or the like). This understanding is gained through a discovery of the network topology.

Topology Discovery

Understanding the network topology is important for a number of reasons. One reason for understanding network topology is that it helps in deciding which port (or communication channel) should be disabled when undesirable packet traffic is detected. Another reason is that it facilitates identification of offending computers' locations (‘bad’ hosts' locations). To that end, the packet traffic monitor discovers the topology of the network, and this discovery is dependent in part on the type of network (i.e., the type of forwarding devices the network includes). Network topology defines the manner in which network devices are organized, although network topology defines a logical architecture as compared to the actual physical architecture of the network. To better understand the discovery aspect, it is worth describing in the first place how the network topology is obtained in various types of networks, i.e., router-based, bridge-based and switch-based networks.

In router-based networks, routing involves two basic activities: determining optimal routing paths and switching. To aid the process of optimal path determination, specific routing algorithms initialize and maintain routing tables that, based on the network topology, contain the shortest paths to reachable networks. These tables contain an entry for each network that can be reached from the router and they provide the basis for routing decisions. Routers exchange routing information periodically to keep their routing tables current so that at any point in time any router knows about any other router in the network topology. The tables contain, for example, all destinations advertised by neighboring routers, where each entry includes the destination address and a list of neighbors that advertised that destination address. For each neighbor, the entry records the advertised metric (e.g., path length) which the neighbor stores in its routing table. Thus, a router-based network adapts to topology changes, and having recorded the current network topology information, each router can have all the information it needs to make the routing decisions. Then, a packet traffic monitor can poll or interrogate that information in gaining its understanding of the network topology.

A bridged Ethernet network is another network that adapts to network topology changes in that it provides for configuration and re-configuration of forwarding tables. Bridges learn the network topology based on the source MAC address and forward packets based on the destination MAC address. The bridged Ethernet network maintains a loop-free forwarding tree (spanning tree), although as compared to the switched Ethernet network, a bridged Ethernet network supports multiple-access links and it reconfigures more slowly. Learning, forwarding and filtering, refined by a spanning tree algorithm, comprise the basic functionality of a bridge. Learning, filtering and forwarding rely on the existence of a single path between any two devices on the network. The spanning tree algorithm is the means by which bridges can eliminate loops in the network topology. To ensure a single path between any two devices, the spanning tree algorithm constructs the spanning tree by a series of bridge-to-bridge negotiations, where the spanning tree represents any unique device-to-device path in the network. The spanning tree is then used in forwarding packets. For example, upon receipt of a broadcast packet, bridges forward the packet up the spanning tree to the root-bridge and then flood the packet down the spanning tree to all the hosts (destinations). Bridges propagate ARP request broadcasts like any other Ethernet broadcast and transparently bridge (forward) the answers. Once the spanning tree is constructed based on the current topology of the network the packet traffic monitor can poll or interrogate the bridges in order to gain an understanding of the network topology.

It is noted that although the present invention can be implemented in the foregoing types of networks it is preferably implemented in a switch-based network. A scaled LAN composed of crossbar switches interconnected by full-duplex links is an example of a switch-based network in which ports of switches can be directly connected to one another and to hosts in an arbitrary topology. The switched Ethernet network has an implicit addressing structure induced by the point-to-point links, where the next hop of a packet is known. Software in the crossbar switches builds packet forwarding (routing) tables and rebuilds them whenever switches or links fail or recover or as switches and links are added or removed. Whenever the topology changes, the switches determine the new topology and update the forwarding tables. The forwarding tables map each MAC address to an output port of the switch. Since the switched Ethernet network has an arbitrary topology, in order to avoid possible deadlocks in routing packets through arbitrary paths, the routing algorithms can restrict the paths to a set of deadlock-free paths based on a loop-free assignment of direction to operational links. In a graph representation of the switched Ethernet network topology, the hosts and switches are vertices and links between them are edges. The graph is formed as a spanning tree constructed from a specific root where each link is assigned a direction such that the directed links do not form loops. Routing paths are determined dynamically by the switches as packets pass from switch to switch (i.e., from point to point). In this example, the forwarding tables are built dynamically in a process that involves monitoring, topology acquisition, and routing. Monitoring determines which links are useful for carrying packets from one switch to another. Topology acquisition discovers the network topology and delivers a description of it to every switch. Routing uses the topology description to compute the forwarding tables for each switch.
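The relationship between the topology graph, the spanning tree and the per-switch forwarding tables can be illustrated with the following simplified sketch. It is not the routing algorithm of any particular switch: a plain breadth-first spanning tree is substituted for the loop-free direction assignment described above, and the function names spanning_tree and forwarding_table, as well as the use of neighbor names in place of output ports, are assumptions made for illustration. The network is assumed to be connected.

```python
from collections import deque


def spanning_tree(links, root):
    """Return a parent map describing a breadth-first spanning tree."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parent:  # links that would close a loop are skipped
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def forwarding_table(parent, switch, hosts):
    """Map each host to the tree neighbor of `switch` that leads toward it."""
    table = {}
    for host in hosts:
        node, prev = host, None
        # Walk from the host toward the root until the switch is met.
        while node is not None and node != switch:
            prev, node = node, parent[node]
        # Below the switch: forward down to `prev`; otherwise forward up.
        table[host] = prev if node == switch else parent[switch]
    return table
```

For example, with links S1-S2, S2-S3, S1-S3, S2-H1 and S3-H2 and root S1, the redundant link S2-S3 is left out of the tree, and switch S2 forwards packets for host H2 toward S1.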

Since the switched network topology takes the form of a tree, either by physical connection or by use of the spanning tree protocol, topology discovery is the process of recovering that tree from the switches. (By analogy, in a router-based scaled, switched Ethernet network the forwarding tables with next hop IP addresses would be obtained from the respective routers.) To that end, the packet traffic monitor uses a network management protocol interface. As an example, understanding of network topology can be gained by utilizing a standard SNMP (simple network management protocol) interface for network hardware (components) management. It is noted that SNMP is a widely used network management protocol, although other standard or proprietary network management protocols can be suitably used for this purpose. Additional information can be used, including, optionally, information from a remote network monitor that accumulates historical data traffic statistics for a network segment.

Detecting and Mitigating Undesirable Behavior

Upon discovering the topology of the network (be it switched Ethernet network or, by analogy, the topology of the bridged Ethernet network, router-based scaled, switched Ethernet network or the like) and upon learning about the capacity of the network components the packet monitor can observe the network. For example, in a switched Ethernet network, the packet traffic monitor learns about the per-port ingress packet counters in the switches and it can poll such counters to observe the number of broadcast packets.
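One way the polled counters might be evaluated is sketched below. The function find_storm_ports, its parameters and the threshold value are hypothetical; the description does not prescribe a particular rate or polling interval. The counter readings are assumed to have been obtained already through the management interface (for instance a per-port broadcast ingress counter such as ifInBroadcastPkts of the standard interfaces MIB, which is an assumption and not stated in this description), and counter wrap-around is ignored for brevity.

```python
def find_storm_ports(previous, current, interval_s, max_broadcasts_per_s):
    """Return {port: rate} for ports whose broadcast ingress rate is excessive.

    `previous` and `current` map a port identifier to the cumulative
    broadcast ingress counter read at the start and end of the interval.
    """
    offenders = {}
    for port, count in current.items():
        # A port seen for the first time contributes no delta on this poll.
        delta = count - previous.get(port, count)
        rate = delta / interval_s
        if rate > max_broadcasts_per_s:
            offenders[port] = rate
    return offenders
```

Combined with the discovered topology, the offending ports indicate the network segment from which the excessive broadcast traffic enters.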

In accordance with its intended purpose, the invention envisions using the packet traffic monitor to determine the existence and source of any pattern of undesirable behavior, including network pathologies such as broadcast storms or ARP fights, and to limit the effects of such behavior. When the packet traffic monitor detects undesirable behavior, including overuse or misuse of the network, as one measure, the monitor takes steps to mitigate this behavior. For example, the packet traffic monitor disables the offending network segment to isolate the offending hosts from the rest of the hosts in the network, or at least from the hosts they are disrupting. Preferably, the invention uses the packet traffic monitor to direct one or more switches (or other forwarding devices such as bridges or smart bridges) to cease forwarding undesirable data traffic.

Thus, upon detecting for example a broadcast storm, the packet traffic monitor mitigates such undesirable behavior pattern by instructing as many switches in the network as possible to stop forwarding those broadcast packets, or perhaps any packets from the offending host. This would allow construction of a large network that would normally allow broadcasts to propagate over the entire network, but which would recover from hosts sending too many broadcast packets. This helps solve a serious problem in conventional networks, especially extended LANs constructed from many Ethernet segments and bridges.

As mentioned above, scaled networks can be constructed using routers to connect the VLANs or “subnets.” As further mentioned, each subnet in the router-based scaled networks can be an isolated domain for packet propagation, where broadcast packets are directed by switches only to the hosts within the same subnet. This limits the damage that can be done by a host that sends too many broadcast packets. This property of routers is often quoted as a reason for using routers instead of constructing a large subnet using only switches or bridges. However, it is not desirable to use routers in this way. One reason is that setting up router parameters and tables can be significantly burdensome to an administrator. And since a router needs to allow (or prevent) broadcast packets to reach all the hosts, it is likewise burdensome to have to worry about which subnet a given host is in. Accordingly, although the present invention can be implemented in a router-based network the present invention contemplates preferred solutions at the switch level (in shared data networks).

In a switched Ethernet network, performance anomalies as outlined above can be addressed by influencing the forwarding scheme in the switches. (By analogy, anomalies in other types of networks, e.g., bridged networks, are addressed by influencing the forwarding schemes of their respective forwarding devices.) One way of influencing the forwarding scheme is discarding offending packets. Another way is to reconfigure the forwarding table or to adjust the routing table in order to stop ‘bad’ packets. Another way of influencing the forwarding scheme is turning off ports, which will isolate all hosts in the network segment, including non-offending hosts. Yet another way of influencing the forwarding scheme is filtering source and destination IP addresses. However, the preferred way of influencing the forwarding scheme is filtering source MAC addresses. As a result, a particular offending host or segment can be selectively isolated for certain time periods.
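The preferred approach of filtering source MAC addresses for limited time periods may be pictured, purely as a sketch, with the following model of a filter table. The class SourceMacFilter and its methods are hypothetical and do not correspond to the configuration interface of any particular switch; times are expressed in seconds on a clock supplied by the caller.

```python
class SourceMacFilter:
    """Illustrative time-limited filter keyed on source MAC address."""

    def __init__(self):
        self.blocked_until = {}  # source MAC -> time at which the block ends

    def block(self, mac, now, duration_s):
        """Stop forwarding packets from `mac` until `now + duration_s`."""
        self.blocked_until[mac] = now + duration_s

    def should_forward(self, src_mac, now):
        """Return True if a packet from `src_mac` may be forwarded at `now`."""
        expiry = self.blocked_until.get(src_mac)
        if expiry is None:
            return True
        if now >= expiry:
            # The isolation period has elapsed; the host rejoins the network.
            del self.blocked_until[src_mac]
            return True
        return False
```

Unlike turning off a port, such a filter leaves non-offending hosts on the same segment unaffected.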

Once forwarding is disabled, the disablement and recovery time interval can be controlled. This measure is in many ways similar in character to the skepticism level and recovery time combination as described in U.S. Pat. No. 5,260,945, issued Nov. 9, 1993, by Thomas Lee Rodeheffer, titled “Intermittent Component Failure Manager and Method for Minimizing Disruption of Distributed Computer System,” which is incorporated herein by reference (hereafter “Rodeheffer”). Although, unlike the present invention, Rodeheffer's approach is directed to failure management that responds to component or link failures or intermittent failures, the basic idea of skepticism and recovery time control (as will be later described) is adopted by the present invention. Other measures are not precluded although this approach has been shown to produce good results.

In general, a skeptic is used when a fault monitor, separate or integral to the skeptic, recognizes a “broken” component or connectivity (or link). Upon receiving a fault indication, the skeptic enters a wait state before it lets such component or connectivity recover, i.e., rejoin the network and prompt reconfiguration of the network topology graph, after it starts working again. When a broken component (e.g., host) is detected, that component is taken out of operation for successively longer periods in a random exponential backoff before an attempt is made to use it once more. The monitor reduces the backoff exponent by one (or other value) if the component is put into service and does not fail again for the current backoff time. Conversely, the backoff time is increased if the component breaks again. Thus, frequently or intermittently broken components are “removed” from the network for progressively longer periods of time, and “repaired” components eventually “forget” their failed history. Namely, a broken component with a long history of failure will be allowed to recover after a progressively longer wait period and more severe penalty, as compared with the progressively decreasing wait period and penalty imposed on a broken component with a ‘good’ history. In one embodiment, the good history can be classified as skepticism level zero (0). Failure cycles in greater numbers increase the skepticism level accordingly. In other words, the skepticism level determines the recovery wait period.
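The skepticism level and recovery wait period described above can be sketched as follows. This is an illustrative reading of the general idea and not a reproduction of Rodeheffer's method: the class Skeptic, the base wait, the doubling per level, the level cap and the random factor between 1 and 2 are all assumed values chosen for the example.

```python
import random


class Skeptic:
    """Illustrative skeptic: exponential backoff keyed on a skepticism level."""

    def __init__(self, base_wait_s=1.0, max_level=10, rng=None):
        self.base_wait_s = base_wait_s
        self.max_level = max_level
        self.level = 0  # level zero corresponds to a 'good' history
        self.rng = rng if rng is not None else random.Random()

    def current_wait(self):
        """Nominal wait period for the present skepticism level."""
        return self.base_wait_s * 2 ** self.level

    def on_fault(self):
        """Return the isolation period for this fault, then raise the level."""
        wait = self.current_wait() * self.rng.uniform(1.0, 2.0)
        self.level = min(self.level + 1, self.max_level)
        return wait

    def on_quiet_period(self):
        """Call when the component has stayed healthy for current_wait()."""
        self.level = max(self.level - 1, 0)
```

Repeated faults therefore yield roughly doubling isolation periods, while each fault-free period lowers the level by one so that a repaired component gradually sheds its failed history.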

By contrast to Rodeheffer's scheme of link and component failure management as described above, the present invention uses the packet traffic monitor to manage network pathologies. In the most general case, the invention is configured to monitor any undesirable behavior patterns including broadcast storms and ARP fights, stolen MAC addresses, malformed packets, etc. Through heuristics, the packet traffic monitor can detect behavior patterns of any kind; and this detection is automatic. For example, the monitor can detect too many packets destined to an overloaded server, too many probe packets directed to a firewall or too many ARP request packets. As a further example, the packet traffic monitor can detect packets arriving in response to ARP requests with more than one packet having a similar MAC address, or packets arriving from hosts that use a similar IP address. Upon detecting such behavior the packet traffic monitor can cause an offending host to be isolated from the network either directly or indirectly, as will be later explained.

Indeed, on detecting an offending packet or an undesirable pattern of packets, the packet traffic monitor may react in the same manner as the aforementioned skeptic in Rodeheffer's fault monitor. Namely, the packet traffic monitor could treat the undesirable packet traffic pattern or behavior as a “fault”, and the originating host of the packets as a “faulty component”. Then, as explained above, the packet traffic monitor will isolate the faulty host (or stop forwarding its packets) for exponentially increasing time periods while the undesirable behavior continues or repeats. It is noted that the packet traffic monitor can be configured to look for a wide range of behaviors that an administrator might consider undesirable although such behaviors are not necessarily considered network “faults.” These aspects of the invention are designed to alert the system administrator, preferably via electronic mail, and to take action in the case of broadcast storms.

Broadcast Storms

To recap, broadcast storms occur when a buggy or malevolent host emits a continuous stream of broadcast packets. Repeated broadcast packets are considered broadcast storms when certain network-administration-policy-dependent conditions apply. For example, a network administration policy may set forth that 10% of the network bandwidth can be consumed by broadcast packets. The bandwidth is that of the network's lowest-speed link (or segment). The packet traffic monitor can thus determine that any use beyond this limit is a broadcast storm.
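As a worked illustration of the policy example above, the following Python sketch computes the broadcast allowance from the lowest-speed link and compares an observed broadcast rate against it. The function names and the sample link speeds are hypothetical; only the 10% figure and the use of the lowest-speed link come from the text.

```python
def broadcast_limit_bps(link_speeds_bps, policy_fraction=0.10):
    """Broadcast allowance in bits per second, based on the slowest link."""
    return min(link_speeds_bps) * policy_fraction


def is_broadcast_storm(broadcast_bytes, interval_s, link_speeds_bps,
                       policy_fraction=0.10):
    """True if broadcast traffic seen during the interval exceeds the allowance."""
    observed_bps = broadcast_bytes * 8 / interval_s
    return observed_bps > broadcast_limit_bps(link_speeds_bps, policy_fraction)
```

For a network mixing 100 Mb/s and 10 Mb/s links, the allowance under this policy is 1 Mb/s, so 200,000 bytes of broadcast traffic in one second (1.6 Mb/s) would be classified as a storm while 100,000 bytes (0.8 Mb/s) would not.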

In one embodiment, the packet traffic monitor observes the network and thereby detects and localizes all broadcast packet traffic. Observing more than a predetermined number of broadcast packets within a predetermined time period implies that a broadcast storm is underway. It is likely that the packet is correctly addressed, and that knowing the source MAC address and the network topology will point to a particular port of a forwarding device, e.g., switch port, to be disabled. In another embodiment, the per-port broadcast ingress packet counters can be used to trace broadcast packets to their source. This approach is used if the packet traffic monitor fails to determine the source, possibly because of incorrectly formatted packets or because the misbehaving host has not been seen on the network before (unknown MAC address). This detection approach is less timely than the prior approach since the process of retrieving these counters from the switch is extensive and it cannot be executed often.
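The first detection approach, counting broadcast packets per source within a time period, might be sketched as follows. This is a minimal Python illustration assuming a sliding window keyed by source MAC address; the class name, the parameter names and the use of caller-supplied timestamps are assumptions, and mapping the flagged MAC address to a switch port via the topology is left out.

```python
from collections import defaultdict, deque


class BroadcastStormDetector:
    """Flags a source that sends too many broadcasts within a time window."""

    def __init__(self, max_packets, window_s):
        self.max_packets = max_packets   # the predetermined number of packets
        self.window_s = window_s         # the predetermined time period
        self._seen = defaultdict(deque)  # source MAC -> recent timestamps

    def observe(self, src_mac, timestamp):
        """Record one broadcast packet; return True if a storm is implied."""
        recent = self._seen[src_mac]
        recent.append(timestamp)
        # Forget packets that have fallen out of the window.
        while timestamp - recent[0] > self.window_s:
            recent.popleft()
        return len(recent) > self.max_packets
```

With `max_packets=3` and `window_s=1.0`, the fourth broadcast from the same source within one second returns `True`, while a broadcast arriving several seconds later starts from a fresh count.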

Once existence of a broadcast storm is detected, warranting action, the port associated with the offending host is disabled. The port will be re-enabled after the passage of an interval, which doubles each time the source port is disabled (i.e., exponentially increasing time period).

Stolen MAC Address

A stolen MAC address situation results from the use of the same MAC address by two hosts with different IP addresses. When the simultaneous use of a MAC address is observed, having gained an understanding of the network topology, the packet traffic monitor can choose one of the conflicting hosts to stay in the network and disable the others.
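A minimal Python sketch of this check follows. It assumes the monitor can reduce its observations to (MAC, IP, port) tuples, with the port identifier derived from the topology; the tuple layout, the function names and the policy of keeping the first-seen host are illustrative assumptions, since the text leaves the choice of surviving host open.

```python
from collections import defaultdict


def find_stolen_macs(observations):
    """Return {mac: {ip: port}} for MAC addresses used with more than one IP.

    observations: iterable of (mac, ip, port) tuples in the order seen.
    """
    by_mac = defaultdict(dict)
    for mac, ip, port in observations:
        # Remember the port where each (mac, ip) pair was first seen.
        by_mac[mac].setdefault(ip, port)
    return {mac: hosts for mac, hosts in by_mac.items() if len(hosts) > 1}


def ports_to_disable(hosts):
    """Assumed policy: keep the first-seen host, disable the others."""
    ports = list(hosts.values())
    return ports[1:]
```

Given the conflict set for one MAC address, `ports_to_disable` yields the switch ports that would be turned off under the assumed first-seen policy.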

ARP Fights

The packet traffic monitor watches for ARP requests, which are, by nature, broadcast to all hosts. The packet traffic monitor can detect excess ARP requests by finding more than a preset number of ARP requests during a predetermined time interval, say 5 minutes. In dealing with ARP fights, the invention monitors the broadcast traffic associated with ARP requests, and verifies the stability and lack of conflict in the IP to MAC address mapping. The packet traffic monitor queues both the source and destination IP addresses from the ARP request for verification. This queue is used to reduce the overall traffic load imposed on the network by the packet traffic monitor. Every second (or other suitable time period), the packet traffic monitor chooses the next IP, and sends an ARP request. If two conflicting responses are received, then two machines have decided to use the same IP address. Then the packet traffic monitor can notify the network administrator by electronic mail.
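The queue-and-verify procedure above might be sketched in Python as follows. The `probe` callable stands in for sending an ARP request and collecting the MAC addresses that answer; it is a hypothetical hook, as are the class and method names, and the de-duplication of queued addresses is an added assumption in keeping with the stated goal of limiting monitor traffic.

```python
from collections import deque


class ArpVerifier:
    """Queues IP addresses seen in ARP requests and verifies one per tick."""

    def __init__(self, probe):
        self.probe = probe       # hypothetical: ip -> iterable of replying MACs
        self.pending = deque()   # IP addresses awaiting verification
        self.queued = set()      # avoids queuing the same IP twice

    def on_arp_request(self, src_ip, dst_ip):
        """Queue both addresses from an observed ARP request."""
        for ip in (src_ip, dst_ip):
            if ip not in self.queued:
                self.queued.add(ip)
                self.pending.append(ip)

    def tick(self):
        """Verify the next queued IP; return (ip, macs) on conflict, else None."""
        if not self.pending:
            return None
        ip = self.pending.popleft()
        self.queued.discard(ip)
        macs = set(self.probe(ip))
        if len(macs) > 1:
            return ip, sorted(macs)   # two hosts answered for the same IP
        return None
```

Calling `tick()` once per second (or other suitable period) bounds the probe traffic to one ARP request per period; a non-`None` result is the point at which the administrator would be notified.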

A packet traffic monitor recognizing an ARP fight is required to determine which IP address is the one correctly associated with that MAC address. A host presently connected in the network (and presumably included in its topology graph) has a known MAC address and a known corresponding IP address. Therefore, it would seem that the received MAC address belongs to the host more likely associated with the correct IP if the host is already present in the network. Based on that, the packet traffic monitor preferably compares the received MAC against all known MAC addresses in order to find a match and correspondingly to find the correct IP address of the appropriate host. Alternatively, the packet traffic monitor assumes that a host that initiated an ARP request was originally on the network and its corresponding IP address is the correct one to use.

Notably, the preferred scheme involves detecting undesirable packets because, short of self-policing hosts, this detection is what instructs forwarding devices such as switches in making their forwarding decisions. Detecting undesirable packets enables the packet traffic monitor to instruct switches to cease forwarding the undesirable packets from offending hosts thereby, indirectly, isolating these hosts from the network.

A policy question is whether the host having the IP address for the longest time should be entitled to continue using it. On the other hand, this IP address might be an address previously allocated by DHCP (dynamic host configuration protocol), for which the host (for any number of reasons) has not properly renewed the lease. Because of this predicament the solution may be limited to notifying the administrator. However, there are some interesting possibilities to be considered.

One possibility arises from the fact that an Internet Software Consortium (ISC) DHCP server includes a flat text file containing its IP to MAC address mappings. Any MAC address that contradicts this list would be disabled. Ideally, the DHCP protocol would include a provision for such verification. Another possibility arises from the fact that some institutions keep a list of MAC addresses that are allowed to obtain an IP address via DHCP. This list includes the user who owns the machine, which would make it particularly easy to notify the parties involved.
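The first possibility can be illustrated with a short Python sketch. It assumes the server's IP to MAC mappings have already been read into a dictionary; parsing the server's file is deliberately left out, and the function name and data shapes are hypothetical.

```python
def contradicting_hosts(authoritative, observed):
    """Return observed (ip, mac) pairs that contradict the server's mapping.

    authoritative: dict mapping IP address -> MAC address (from the DHCP server)
    observed: iterable of (ip, mac) pairs seen on the network
    """
    offenders = []
    for ip, mac in observed:
        expected = authoritative.get(ip)
        # Addresses unknown to the server are left alone in this sketch.
        if expected is not None and expected.lower() != mac.lower():
            offenders.append((ip, mac))
    return offenders
```

Each returned pair identifies a MAC address that would be a candidate for disablement; how unknown addresses are treated is a policy decision that this sketch does not attempt to settle.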

Placement of the Packet Traffic Monitor

The effectiveness level of the foregoing approaches depends on the placement of the packet traffic monitor. Likewise, placement of the packet traffic monitor determines the type of network pathology it can better address. The more strategic the location the better the result.

In the case of overuse or abuse of a given server, the monitor would have to be placed such that it is able to monitor packets arriving at that server. If a packet traffic monitor were able to shut off or filter the stream of all packets sufficiently early it would be useful in counteracting attacks such as distributed denial-of-service attacks. Distributed denial-of-service attacks occur where one or a group of malicious hosts send packets in such large numbers that they impose a significant load on their target and prevent other hosts from reaching that target. Ideally, for such attacks each host should have a dedicated packet traffic monitor.

If the network is a mixture of high-speed and low-speed segments, the monitor should be placed in the high-speed link to detect broadcast storms more reliably. If, instead, the packet traffic monitor were to be placed in a low-speed link, a broadcast storm would flood that link before that monitor would have a chance to send its packet to control (or instruct) the switch. In any case, packet traffic monitor packets will preferably have a higher priority over regular packets.

Accordingly, various placement schemes are possible in which one or more packet traffic monitors can be strategically placed in the network. Moreover, it is possible that, at any given time, several packet traffic monitors are simultaneously in use on a single network. FIGS. 2-4 provide examples of packet traffic monitor placement in a switch-based shared data network.

In the network 100 of FIG. 2, the packet traffic monitor is an integral part of one or more of the switches 114. In the network 200 of FIG. 3, the packet traffic monitor 118 is a device distinct from but connected to one or more of the switches 112. The effectiveness of the packet traffic monitor in isolating offending hosts increases with its ability to monitor a greater number of communication paths and with its ability to instruct (or control) a greater number of switches. Internally, this may necessitate means for monitoring a plurality of paths or, instead, a plurality of devices each for monitoring a path. Alternatively, a packet traffic monitor is more effective if it can control the majority (or a larger number) of the switches. Typically, networks are configured with one type of switch, and this uniformity makes the packet traffic monitor easier to configure for communication with the switches.

FIG. 4 illustrates a network 300 in which each host 110 is self-policing with its dedicated packet traffic monitor 116 (internal or external to the host). Alternatively (not shown), there could be a device (or software module) in each host that is operatively cooperative with an external packet traffic monitor in that such device gathers information about and allows the monitor to control the host.

Thus, there is a continuum extending between two extremes. At one extreme, the packet traffic monitor is present in or associated with each of the hosts that, in addition, are cooperative with it. At the other, less desirable extreme, there is a single packet traffic monitor in the network and no cooperation from any of the hosts. Even with a condition similar to the less desirable extreme condition, in a hypothetical Ethernet segment with one switch and one host (or one switch per host), broadcast storms from/to the host can be readily stopped. By comparison, if multiple hosts are connected to one switch on the Ethernet segment, the switch may isolate the segment from other segments but it will not be able to isolate hosts within that segment from broadcast storms.

FIGS. 2-4 show the packet traffic monitors as a device or as a sub-component of other devices. Indeed, the packet traffic monitor could be implemented as a hardware module built into network components such as switches or hosts, or it may be built as a separate device. It is noted, however, that the preferred implementation of the packet traffic monitor is a computer program (or software) module. This software module can be configured into the system software of a separate device, a switch, a host, etc. The packet traffic monitor software module can be added to the existing system software and is likely to be a privileged application. Moreover, this software module can be, but is not required to be, a part of the operating system.

In one embodiment the present invention envisions a monitor that observes packet traffic at various points on the network by prompting the switches or hosts to forward packets to the monitor. Alternatively, various points on the network can be randomly or selectively “sampled” rather than being exhaustively monitored, where a representative sampling of the packets is obtained rather than all of them. Again, this sampling might be implemented inside switches or hosts. Preferably, the sampling of packets is random, although other approaches are possible. For example, packets may be sampled during certain time intervals or in any other selective manner.
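Random sampling of this kind might look like the following Python sketch, in which each packet is independently forwarded to the monitor with a fixed probability. The function name, the `rate` parameter and the per-packet independence are illustrative assumptions; the text does not prescribe a particular sampling method.

```python
import random


def sample_packets(packets, rate, rng=None):
    """Return the packets selected when each is kept with probability `rate`."""
    rng = rng or random.Random()
    # rng.random() is in [0.0, 1.0), so rate=0.0 keeps nothing and 1.0 keeps all.
    return [p for p in packets if rng.random() < rate]
```

A rate of 0.1 would pass roughly one packet in ten to the monitor, trading detection latency for a lower load on the switch or host that performs the sampling.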

The invention additionally contemplates that in one embodiment the packet traffic monitor will react selectively to packets or higher-level information it receives about packets from components such as hosts. To that end, it is further envisioned that hosts and other components are capable of observing and delivering such information to the packet traffic monitor. Then, the packet traffic monitor may choose to act or not to act in response to such information. For example, a host may be able to detect an undesirable pattern of packet traffic that may not be obvious by low-level observations of packets (such as e-mail spam from another host). That host may be able to send the information about the observed packets to the packet traffic monitor. Then, the packet traffic monitor may use filters and configuration parameters to decide whether the information it is receiving is likely to be trustworthy, and how to act on it, if at all.

In yet another embodiment, many points in the network can be observed or sampled by one packet traffic monitor or by a set of co-operating packet traffic monitors. The packet traffic monitor may be able to detect undesirable packet traffic patterns or usage of network resources that are not observable from the vantage of one point alone. For example, a monitor may be able to detect that a large number of hosts are “ganging up” on some other host, even though, individually, no single host is overloading the victim host.
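The “ganging up” example can be illustrated by aggregating flow counts gathered from several observation points. The Python sketch below assumes flows are reported as (source, destination, packet count) tuples and that separate per-source and aggregate limits are configured; these names and thresholds are hypothetical.

```python
from collections import defaultdict


def find_ganged_up_victims(flows, per_source_limit, aggregate_limit):
    """Return {destination: total} where the total is excessive although
    no single source exceeds its own limit.

    flows: iterable of (src, dst, packet_count) tuples from all vantage points.
    """
    per_dst = defaultdict(lambda: defaultdict(int))
    for src, dst, count in flows:
        per_dst[dst][src] += count
    victims = {}
    for dst, sources in per_dst.items():
        total = sum(sources.values())
        if total > aggregate_limit and max(sources.values()) <= per_source_limit:
            victims[dst] = total
    return victims
```

A destination overloaded by a single source is deliberately excluded here, since that case is visible from one vantage point and would be caught by per-host checks.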

Ideally, the packet traffic monitor would cause only the undesirable packets to be discarded and as close to their source as possible. When this is not possible, it may be necessary to discard more packets and perhaps further from the source. Since the packet traffic monitor may not in all cases be associated with the faulty component it may not be able to shut down that component directly. Instead, the packet traffic monitor resorts to switch(es) in the segment where the faulty host is attached. For example, the packet traffic monitor could ask the switch directly connected to the faulty host to stop forwarding particular packets from that host, all packets from that host, or perhaps all packets from the network segment to which that host is attached. The description of what packets should be discarded may not need to be directly related to the description of packets that were originally detected by the packet traffic monitor.

An alternative to discarding (or shutting off) undesirable packets at the source would be to discard them near their destination, if this is easier. In addition to shutting off the undesirable packets, the monitor may also notify an administrator of the action that had been taken.

CONCLUSION

The packet traffic monitor is able to recover the topology of a network, such as the switched Ethernet network, using commonly available information from SNMP (simple network management protocol) or any other suitable network management protocol. This feature makes it particularly useful to network administrators interested in planning improvements to the network infrastructure.

With this understanding of host locations, it becomes possible to disable the ports of switches that connect to offending hosts. The packet traffic monitor disables some misbehaving hosts, and reports misbehavior to the network administrator. By disconnecting offending hosts, it is possible to preserve connectivity between “correctly-behaving” hosts.

Although the present invention has been described in accordance with the embodiments shown, variations to the embodiments would be apparent to those skilled in the art and those variations would be within the scope and spirit of the present invention. Accordingly, it is intended that the specification and embodiments shown be considered as exemplary only, with a true scope of the invention being indicated by the following claims and equivalents.

1. A method for limiting the impact of undesirable behavior of computers on a network through which packets of data are interchanged between the computers, comprising: monitoring the network for any patterns of behavior; determining, upon discovering that one or more of the patterns of behavior is undesirable, a type of the undesirable pattern of behavior; determining a proper action for mitigating that type of undesirable behavior, the proper action including preventing dissemination through the network of packets associated with the undesirable behavior and allowing dissemination of packets not associated with the undesirable behavior, wherein preventing dissemination comprises at least one of changing a routing table, changing a forwarding table, turning off at least one port of a forwarding device, filtering on Internet Protocol (IP) addresses, and filtering on media access control (MAC) addresses, and wherein a discovery, including that of a network topology, facilitates the network monitoring and type of undesirable behavior determination, and wherein the dissemination through the network of packets associated with the undesirable behavior is prevented for a time period that is lengthened gradually as long as the undesirable behavior continues or intermittently reappears, the time period being gradually shortened if the undesirable behavior stops for a predetermined time.
2. The method of claim 1 wherein the time period corresponds to a skepticism level that depends on a history of the undesirable pattern of behavior, a skepticism level zero (0) denoting a good history.
3. The method of claim 1, wherein the undesirable pattern of behavior is characterized in that it matches behavior defined by a network administrator as notable or undesirable.
4. The method of claim 1, wherein the undesirable pattern of behavior is any network pathology characterized as a broadcast storm or an address resolution protocol (ARP) fight.
5. The method of claim 1, wherein the undesirable pattern of behavior includes any one or more of a stolen Internet protocol (IP) address, a stolen media access control (MAC) address, a malformed packet, too many packets directed to an overloaded server, too many probe packets directed to a firewall or too many ARP request packets.
6. The method of claim 1, wherein the undesirable pattern of behavior is a broadcast storm, and wherein the monitoring includes recovering a topology of the network using information obtained through a network management protocol interface, and learning historical packet traffic statistics for any segment of the network.
7. The method of claim 6, wherein the network management protocol is the simple network management protocol (SNMP).
8. The method of claim 1, wherein the undesirable pattern of behavior is a broadcast storm, and wherein the monitoring includes learning a topology of the network from a forwarding database or table of a forwarding device in the network.
9. The method of the claim 1, wherein the network is a shared data network.
10. The method of claim 8, wherein the network is a switched Ethernet network and the forwarding device is a switch.
11. The method of claim 8, wherein the network is a bridged Ethernet network and the forwarding device is a bridge or a smart bridge.
12. The method of the claim 1, wherein the undesirable pattern of behavior is too many ARP requests and wherein the monitoring includes verifying stability and lack of conflicts in an IP or MAC address mapping.
13. The method of the claim 1, wherein the proper action further includes alerting a system administrator about the existence of the undesirable pattern of behavior.
14. The method of claim 1, wherein the undesirable pattern of behavior is a simultaneous use of a network address, and wherein the proper action includes disabling any address associated to the network address that contradicts an address list in a network server or disabling any associated address that is not included in a list of addresses that are allowed to map to the network address.
15. The method of claim 1 wherein discovery of the network topology facilitates disablement of ports in forwarding devices that connect to offending computers.
16. The method of claim 1 wherein the time period becomes longer in a random exponential backoff before an attempt is made to allow resumption of the packets from any offending computer that originated the undesirable pattern of behavior, the time period becoming longer if the undesirable pattern of behavior reoccurs during a current backoff time, the time period becoming shorter if the undesirable pattern of behavior disappears and does not reoccur in the current backoff time.
17. A system for limiting the impact of undesirable behavior of computers on a network through which packets of data are interchanged between the computers, comprising: means for monitoring the packets for any patterns of behavior; means for determining, upon discovering that one or more of the patterns of behavior is undesirable, a type of the undesirable pattern of behavior; means for determining a proper action for mitigating that type of undesirable behavior, the proper action, performed by mitigation means, including preventing dissemination through the network of packets associated with the undesirable behavior and allowing dissemination of packets not associated with the undesirable behavior, wherein preventing dissemination comprises at least one of changing a routing table, changing a forwarding table, and turning off at least one port of a forwarding device, and wherein means for discovery, including that of a network topology, facilitates network monitoring and type of undesirable behavior determination, and wherein the dissemination through the network of packets associated with the undesirable behavior is prevented for a time period that is lengthened gradually as long as the undesirable behavior continues or intermittently reappears, the time period being gradually shortened if the undesirable behavior stops for a predetermined time.
18. The system of claim 17, wherein the time period corresponds to a skepticism level that depends on a history of the undesirable pattern of behavior, a skepticism level zero (0) denoting a good history.
19. The system of claim 17, wherein the undesirable pattern of behavior is characterized in that it matches behavior defined by a network administrator as notable or undesirable.
20. The system of claim 17, wherein the undesirable pattern of behavior is any network pathology characterized as a broadcast storm or an address resolution protocol (ARP) fight.
21. The system of claim 17, wherein the undesirable pattern of behavior includes any one or more of a stolen Internet protocol (IP) address, a stolen media access control (MAC) address, a malformed packet, too many packets directed to an overloaded server, too many probe packets directed to a firewall or too many ARP request packets.
22. The system of claim 17, wherein preventing the dissemination of packets associated with the undesirable pattern of behavior includes discarding the packets associated with such behavior, isolating any of the computers at which such behavior originates, or isolating any network segments at which such behavior originates.
23. The system of claim 17, wherein the undesirable pattern of behavior is a broadcast storm, and wherein the monitoring means includes means for recovering a topology of the network using information obtained through a standard SNMP (simple network management protocol) interface, and means for learning historical packet traffic statistics for any segment of the network.
24. The system of claim 17, wherein the undesirable pattern of behavior is a broadcast storm, and wherein the monitoring means includes means for learning the topology of the network from a forwarding database or table of a forwarding device in the network.
25. The system of claim 24, wherein the network is a switched Ethernet network and the forwarding device is a switch.
26. The system of claim 17, wherein the network is a shared data network.
27. The system of claim 17, wherein the undesirable pattern of behavior is too many ARP requests and wherein the monitoring means includes means for verifying stability and lack of conflicts in an IP or MAC address mapping.
28. The system of claim 17 wherein the proper action includes alerting a system administrator about the existence of the undesirable pattern of behavior.
29. The system of claim 17, wherein the undesirable pattern of behavior is a simultaneous use of a network address, and wherein the proper action includes disabling any address associated to the network address that contradicts an address list in a network server or disabling any associated address that is not included in a list of addresses that are allowed to map to the network address.
30. The system of claim 17, wherein discovery of the network topology facilitates disablement of ports in forwarding devices that connect to offending computers.
31. The system of claim 17, wherein the time period becomes longer in a random exponential backoff before an attempt is made to allow resumption of the packets from any offending computer that originated the undesirable pattern of behavior, the time period becoming longer if the undesirable pattern of behavior reoccurs during a current backoff time, the time period becoming shorter if the undesirable pattern of behavior disappears and does not reoccur in the current backoff time.
32. The method of claim 17, wherein the dissemination through the network of packets associated with the undesirable behavior is prevented for a time period that is exponentially increasing as long as the undesirable behavior continues or intermittently reappears, the time period being exponentially shortened if the undesirable behavior stops for a predetermined time.